The following assignment demonstrates how to read and manipulate energy data files in order to create bar plots and geospatial maps.
PGE puts out Excel files of energy use data for each quarter that anyone can download and analyze. I downloaded the 2017 - 2020 data from their site (with 2020 only having Q1 and Q2 so far).
I then restricted zip codes to the Bay Area in order to analyze energy usage in the Bay Area from 2017 to 2020 and changes in residential electric energy use after the pandemic began in Q2 of 2020.
Part 1: This part of the assignment creates a bar plot of residential and commercial electric and gas energy data for Bay Area zip codes from quarter 1 of 2017 to quarter 2 of 2020. At the time of analysis, data from quarters 3 and 4 of 2020 were not yet available so they were not included, but could potentially be in the future.
This imports the libraries needed:
Creates a data set for all of the electricity use data from Q1 2017 to Q2 2020:
## [1] "PGE_2017_Q1_ElectricUsageByZip.csv"
## [1] "PGE_2017_Q2_ElectricUsageByZip.csv"
## [1] "PGE_2017_Q3_ElectricUsageByZip.csv"
## [1] "PGE_2017_Q4_ElectricUsageByZip.csv"
## [1] "PGE_2018_Q1_ElectricUsageByZip.csv"
## [1] "PGE_2018_Q2_ElectricUsageByZip.csv"
## [1] "PGE_2018_Q3_ElectricUsageByZip.csv"
## [1] "PGE_2018_Q4_ElectricUsageByZip.csv"
## [1] "PGE_2019_Q1_ElectricUsageByZip.csv"
## [1] "PGE_2019_Q2_ElectricUsageByZip.csv"
## [1] "PGE_2019_Q3_ElectricUsageByZip.csv"
## [1] "PGE_2019_Q4_ElectricUsageByZip.csv"
## [1] "PGE_2020_Q1_ElectricUsageByZip.csv"
## [1] "PGE_2020_Q2_ElectricUsageByZip.csv"
Filters the electricity data so that the data frame only contains total energy usage and converts the energy data values from kWh to kBTU. The energy is converted to kBTU so that it can also be compared with gas data which is usually in therms. Therms will also be converted to kBTU:
## # A tibble: 72,649 x 5
## # Groups: MONTH, CUSTOMERCLASS [24]
## ZIPCODE MONTH YEAR CUSTOMERCLASS TOTALKBTU
## <dbl> <dbl> <dbl> <chr> <dbl>
## 1 93101 1 2017 Elec- Commercial 0
## 2 93101 2 2017 Elec- Commercial 0
## 3 93101 3 2017 Elec- Commercial 0
## 4 93105 1 2017 Elec- Commercial 0
## 5 93105 2 2017 Elec- Commercial 0
## 6 93105 3 2017 Elec- Commercial 0
## 7 93110 1 2017 Elec- Commercial 0
## 8 93110 2 2017 Elec- Commercial 0
## 9 93110 3 2017 Elec- Commercial 0
## 10 93117 1 2017 Elec- Commercial 0
## # ... with 72,639 more rows
This creates a data set for all of the gas use data from Q1 2017 to Q2 2020:
## [1] "PGE_2017_Q1_GasUsageByZip.csv"
## [1] "PGE_2017_Q2_GasUsageByZip.csv"
## [1] "PGE_2017_Q3_GasUsageByZip.csv"
## [1] "PGE_2017_Q4_GasUsageByZip.csv"
## [1] "PGE_2018_Q1_GasUsageByZip.csv"
## [1] "PGE_2018_Q2_GasUsageByZip.csv"
## [1] "PGE_2018_Q3_GasUsageByZip.csv"
## [1] "PGE_2018_Q4_GasUsageByZip.csv"
## [1] "PGE_2019_Q1_GasUsageByZip.csv"
## [1] "PGE_2019_Q2_GasUsageByZip.csv"
## [1] "PGE_2019_Q3_GasUsageByZip.csv"
## [1] "PGE_2019_Q4_GasUsageByZip.csv"
## [1] "PGE_2020_Q1_GasUsageByZip.csv"
## [1] "PGE_2020_Q2_GasUsageByZip.csv"
This filters the gas data so that the data frame only contains total energy usage and converts from therms to kBTU:
## # A tibble: 50,055 x 5
## # Groups: MONTH, CUSTOMERCLASS [24]
## ZIPCODE MONTH YEAR CUSTOMERCLASS TOTALKBTU
## <dbl> <dbl> <dbl> <chr> <dbl>
## 1 92304 1 2017 Gas- Commercial 0
## 2 92304 2 2017 Gas- Commercial 0
## 3 92304 3 2017 Gas- Commercial 0
## 4 92365 1 2017 Gas- Commercial 0
## 5 92365 2 2017 Gas- Commercial 0
## 6 92365 3 2017 Gas- Commercial 0
## 7 93203 1 2017 Gas- Commercial 0
## 8 93203 2 2017 Gas- Commercial 0
## 9 93203 3 2017 Gas- Commercial 0
## 10 93204 1 2017 Gas- Commercial 0
## # ... with 50,045 more rows
This binds the total electricity use and gas use data into a dataframe that will then be plotted:
This filters out zip code data for the Bay Area:
## Simple feature collection with 44668 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.5335 ymin: 36.89303 xmax: -121.3083 ymax: 38.93713
## Geodetic CRS: WGS 84
## # A tibble: 44,668 x 6
## # Groups: ZIPCODE, MONTH, YEAR [12,189]
## ZIPCODE MONTH YEAR CUSTOMERCLASS TOTALKBTU geometry
## <chr> <dbl> <dbl> <chr> <dbl> <MULTIPOLYGON [°]>
## 1 94002 1 2017 Elec- Commerc~ 1.41e10 (((-122.3359 37.50805, -122.330~
## 2 94002 1 2017 Elec- Residen~ 1.87e10 (((-122.3359 37.50805, -122.330~
## 3 94002 1 2017 Gas- Commerci~ 0 (((-122.3359 37.50805, -122.330~
## 4 94002 1 2017 Gas- Resident~ 8.35e10 (((-122.3359 37.50805, -122.330~
## 5 94002 1 2018 Elec- Commerc~ 1.39e10 (((-122.3359 37.50805, -122.330~
## 6 94002 1 2018 Elec- Residen~ 1.73e10 (((-122.3359 37.50805, -122.330~
## 7 94002 1 2018 Gas- Commerci~ 1.26e10 (((-122.3359 37.50805, -122.330~
## 8 94002 1 2018 Gas- Resident~ 6.40e10 (((-122.3359 37.50805, -122.330~
## 9 94002 1 2019 Elec- Commerc~ 1.44e10 (((-122.3359 37.50805, -122.330~
## 10 94002 1 2019 Elec- Residen~ 1.75e10 (((-122.3359 37.50805, -122.330~
## # ... with 44,658 more rows
Filters the energy data of the Bay Area to just the factors needed for the analysis: year, month, and customerclass:
## Simple feature collection with 168 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.5335 ymin: 36.89303 xmax: -121.3083 ymax: 38.93713
## Geodetic CRS: WGS 84
## # A tibble: 168 x 5
## # Groups: YEAR, MONTH [42]
## YEAR MONTH CUSTOMERCLASS TOTALKBTU geometry
## <dbl> <dbl> <chr> <dbl> <MULTIPOLYGON [°]>
## 1 2017 1 Elec- Commerci~ 5.63e12 (((-122.9915 37.99445, -122.9942 37.99~
## 2 2017 1 Elec- Resident~ 4.80e12 (((-122.9895 38.28024, -122.9946 38.28~
## 3 2017 1 Gas- Commercial 4.60e12 (((-123.1361 38.75497, -123.1409 38.75~
## 4 2017 1 Gas- Residenti~ 1.79e13 (((-122.6139 37.87843, -122.6174 37.87~
## 5 2017 2 Elec- Commerci~ 4.69e12 (((-122.9915 37.99445, -122.9942 37.99~
## 6 2017 2 Elec- Resident~ 3.82e12 (((-122.9895 38.28024, -122.9946 38.28~
## 7 2017 2 Gas- Commercial 3.29e12 (((-123.1361 38.75497, -123.1409 38.75~
## 8 2017 2 Gas- Residenti~ 1.18e13 (((-122.6139 37.87843, -122.6174 37.87~
## 9 2017 3 Elec- Commerci~ 4.91e12 (((-122.9915 37.99445, -122.9942 37.99~
## 10 2017 3 Elec- Resident~ 3.70e12 (((-122.9895 38.28024, -122.9946 38.28~
## # ... with 158 more rows
Combines YEAR and MONTH into one column so that the information is more clearly summarized in the plot:
Further filters the energy data into date and month, customer class, and total kBTU:
## Simple feature collection with 168 features and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.5335 ymin: 36.89303 xmax: -121.3083 ymax: 38.93713
## Geodetic CRS: WGS 84
## # A tibble: 168 x 4
## dates CUSTOMERCLASS TOTALKBTU geometry
## <chr> <chr> <dbl> <MULTIPOLYGON [°]>
## 1 2017-1 Elec- Commerc~ 5.63e12 (((-122.9915 37.99445, -122.9942 37.99318, -~
## 2 2017-1 Elec- Residen~ 4.80e12 (((-122.9895 38.28024, -122.9946 38.2831, -1~
## 3 2017-1 Gas- Commerci~ 4.60e12 (((-123.1361 38.75497, -123.1409 38.75663, -~
## 4 2017-1 Gas- Resident~ 1.79e13 (((-122.6139 37.87843, -122.6174 37.87723, -~
## 5 2017-2 Elec- Commerc~ 4.69e12 (((-122.9915 37.99445, -122.9942 37.99318, -~
## 6 2017-2 Elec- Residen~ 3.82e12 (((-122.9895 38.28024, -122.9946 38.2831, -1~
## 7 2017-2 Gas- Commerci~ 3.29e12 (((-123.1361 38.75497, -123.1409 38.75663, -~
## 8 2017-2 Gas- Resident~ 1.18e13 (((-122.6139 37.87843, -122.6174 37.87723, -~
## 9 2017-3 Elec- Commerc~ 4.91e12 (((-122.9915 37.99445, -122.9942 37.99318, -~
## 10 2017-3 Elec- Residen~ 3.70e12 (((-122.9895 38.28024, -122.9946 38.2831, -1~
## # ... with 158 more rows
This creates a plot of the data:
Part 2: This part of the assignment analyzes residential energy usage for Bay Area block groups before and after the pandemic began. The pandemic technically began in quarter 1 of 2020, but quarter 2 of 2020 are where the changes more most likely seen because shelter-in-place began late in quarter 1 of 2020. Therefore, comparing the same quarter from the year before, quarter 2 of 2019 to quarter 2 of 2020 would most likely show a difference in residential electric energy usage. Comparing the same quarter but in different years can also help decrease the chance that changes in residential electric energy usage are not just due to changes of the seasons.
Imports the CSV file of Q2 2019, before pandemic began:
Filters out the residential electricity data and converts to kBTU:
## # A tibble: 861 x 2
## ZIPCODE AVERAGEKBTU_19
## <dbl> <dbl>
## 1 92356 NaN
## 2 93101 NaN
## 3 93105 NaN
## 4 93110 NaN
## 5 93117 NaN
## 6 93201 5659507.
## 7 93203 5428005.
## 8 93204 4803284.
## 9 93206 6942065.
## 10 93210 5138706.
## # ... with 851 more rows
Imports the CSV file of Q2 2020, after the pandemic began:
Filters out the residential electricity data and converts to kBTU:
## # A tibble: 854 x 2
## ZIPCODE AVERAGEKBTU_20
## <dbl> <dbl>
## 1 93101 NaN
## 2 93105 NaN
## 3 93110 NaN
## 4 93117 NaN
## 5 93201 6544425.
## 6 93203 6040063.
## 7 93204 5508462.
## 8 93206 7457756.
## 9 93210 5781853.
## 10 93212 6236243.
## # ... with 844 more rows
Filters the data so that only energy data of Bay Area block groups is included:
Creates a data frame that has energy data for the second quarter of 2020
## Simple feature collection with 1763 features and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.5335 ymin: 36.89303 xmax: -121.3083 ymax: 38.93713
## Geodetic CRS: WGS 84
## # A tibble: 1,763 x 4
## ZIPCODE AVERAGEKBTU_19 geometry AVERAGEKBTU_20
## * <chr> <dbl> <MULTIPOLYGON [°]> <dbl>
## 1 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 4000585.
## 2 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 7052981.
## 3 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 4176229.
## 4 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 4569354.
## 5 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 4656987.
## 6 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 3936503.
## 7 94002 3674313. (((-122.3359 37.50805, -122.3302 37.51~ 3918472.
## 8 94005 3408835. (((-122.442 37.69435, -122.44 37.69498~ 3785718.
## 9 94005 3408835. (((-122.442 37.69435, -122.44 37.69498~ 4142974.
## 10 94005 3408835. (((-122.442 37.69435, -122.44 37.69498~ 3746898.
## # ... with 1,753 more rows
Creates a new column that takes the difference in values of kBTU energy usage of quater 2 of 2020 from quarter 2 of 2020. This is possible because the same zip codes are used for both columns of data:
Creates an color coded map of energy usage for quarter 2 of 2019. The more intense red color indicates greater residential electric energy usage for that block group:
Creates an color coded map of energy usage for quarter 2 of 2020. The more intense blue color indicates greater residential electric energy usage for that block group:
Creates an color coded map of the difference in energy usage for block groups. The more intense green color indicates a greater change residential electric energy usage for that block group from quarter 2 of 2019 to quarter 2 of 2020, or change from before and after the pandemic: